


Creators/Authors contains: "Jiang, Jiming"


  1. Free, publicly-accessible full text available January 1, 2025
  2. Abstract

    Subgenome dominance has been reported in diverse allopolyploid species, where genes from one subgenome are preferentially retained and are more highly expressed than those from the other subgenome(s). However, the molecular mechanisms responsible for subgenome dominance remain poorly understood. Here, we develop a genome-wide map of accessible chromatin regions (ACRs) in cultivated strawberry (2n = 8x = 56, with A, B, C, D subgenomes). Each ACR is identified as an MNase hypersensitive site (MHS). We discover that the dominant subgenome A contains a greater number of total MHSs and more MHSs per gene than the submissive B/C/D subgenomes. Subgenome A suffers fewer losses of MHS-related DNA sequences and fewer MHS fragmentations caused by insertions of transposable elements. We also discover that genes and MHSs related to stress response have been preferentially retained in subgenome A. We conclude that preservation of genes and their cognate ACRs, especially those related to stress responses, plays a major role in the establishment of subgenome dominance in octoploid strawberry.

     
  3. Abstract Objectives

    Lavandula angustifolia (English lavender) is commercially important not only as an ornamental species but also as a major source of fragrances. To better understand the genomic basis of chemical diversity in lavender, we sequenced, assembled, and annotated the ‘Munstead’ cultivar of L. angustifolia.

    Data description

    A total of 80 Gb of Oxford Nanopore Technologies reads was used to assemble the ‘Munstead’ genome using the Canu genome assembler software. Following multiple rounds of error correction and scaffolding using Hi-C data, the final chromosome-scale assembly represents 795,075,733 bp across 25 chromosomes with an N50 scaffold length of 31,371,815 bp. Benchmarking Universal Single Copy Orthologs analysis revealed 98.0% complete orthologs, indicative of a high-quality assembly representative of genic space. Annotation of protein-coding sequences revealed 58,702 high-confidence genes encoding 88,528 gene models. Access to the ‘Munstead’ genome will permit comparative analyses within and among lavender accessions and provides a pivotal species for comparative analyses within Lamiaceae.

     
  4. Transcriptional divergence of duplicated genes after whole genome duplication (WGD) has been described in many plant lineages and is often associated with subgenome dominance, a genome-wide mechanism. However, it is unknown what underlies the transcriptional divergence of duplicated genes in polyploid species that lack subgenome dominance. Soybean is a paleotetraploid with a WGD that occurred 5 to 13 Mya. Approximately 50% of the duplicated genes retained from this WGD exhibit transcriptional divergence. We developed accessible chromatin region (ACR) datasets from leaf, flower, and seed tissues using MNase-hypersensitivity sequencing. We validated enhancer function of several ACRs associated with known genes using CRISPR/Cas9-mediated genome editing. The ACR datasets were used to examine and correlate the transcriptional patterns of 17,111 pairs of duplicated genes in different tissues. We demonstrate that ACR dynamics are correlated with divergence of both expression level and tissue specificity of individual gene pairs. Gain or loss of flanking ACRs and mutation of cis-regulatory elements (CREs) within the ACRs can change the balance of the expression level and/or tissue specificity of the duplicated genes. Analysis of DNA sequences associated with ACRs revealed that the extensive sequence rearrangement after the WGD reshaped the CRE landscape, which appears to play a key role in the transcriptional divergence of duplicated genes in soybean. This may represent a general mechanism for transcriptional divergence of duplicated genes in polyploids that lack subgenome dominance.

     
    Free, publicly-accessible full text available October 31, 2024
  5. We propose a new classified mixed model prediction (CMMP) procedure, called pseudo-Bayesian CMMP, that uses network information in matching the group index between the training data and new data, whose characteristics of interest one wishes to predict. The current CMMP procedures do not incorporate such information; as a result, the methods are not consistent in terms of matching the group index. Although, as the number of training data groups increases, the current CMMP method can predict the mixed effects of interest consistently, its accuracy is not guaranteed when the number of groups is moderate, as is the case in many potential applications. The proposed pseudo-Bayesian CMMP procedure assumes a flexible working probability model for the group index of the new observation to match the index of a training data group, which may be viewed as a pseudo prior. We show that, given any working model satisfying mild conditions, the pseudo-Bayesian CMMP procedure is consistent and asymptotically optimal both in terms of matching the group index and in terms of predicting the mixed effect of interest associated with the new observations. The theoretical results are fully supported by results of empirical studies, including Monte-Carlo simulations and real-data validation.
    Free, publicly-accessible full text available July 3, 2024
  6. Abstract

    We develop a method of classified mixed model prediction based on generalized linear mixed models that incorporate pseudo‐prior information to improve prediction accuracy. We establish consistency of the proposed method both in terms of prediction of the true mixed effect of interest and in terms of correctly identifying the potential class corresponding to the new observations if such a class matching one of the training data classes exists. Empirical results, including simulation studies and real‐data validation, fully support the theoretical findings.

     
    Free, publicly-accessible full text available June 1, 2024
  7. Tribble, C (Ed.)
    Abstract

    The majority of sequenced genomes in the monocots are from species belonging to Poaceae, which include many commercially important crops. Here, we expand the number of sequenced genomes from the monocots to include the genomes of 4 related cyperids: Carex cristatella and Carex scoparia from Cyperaceae and Juncus effusus and Juncus inflexus from Juncaceae. The high-quality, chromosome-scale genome sequences from these 4 cyperids were assembled by combining whole-genome shotgun sequencing of Nanopore long reads, Illumina short reads, and Hi-C sequencing data. Some members of the Cyperaceae and Juncaceae are known to possess holocentric chromosomes. We examined the repeat landscapes in our sequenced genomes to search for potential repeats associated with centromeres. Several large satellite repeat families, comprising 3.2–9.5% of our sequenced genomes, showed dispersed distribution of large satellite repeat clusters across all Carex chromosomes, with few instances of these repeats clustering in the same chromosomal regions. In contrast, most large Juncus satellite repeats were clustered in a single location on each chromosome, with sporadic instances of large satellite repeats throughout the Juncus genomes. Recognizable transposable elements account for about 20% of each of the 4 genome assemblies, with the Carex genomes containing more DNA transposons than retrotransposons while the converse is true for the Juncus genomes. These genome sequences and annotations will facilitate better comparative analysis within monocots.
  8. Abstract

    We derive precise asymptotic results that are directly usable for confidence intervals and Wald hypothesis tests for likelihood-based generalized linear mixed model analysis. The essence of our approach is to derive the exact leading term behaviour of the Fisher information matrix when both the number of groups and number of observations within each group diverge. This leads to asymptotic normality results with simple studentizable forms. Similar analyses result in tractable leading term forms for the determination of approximate locally D-optimal designs.

     